feat(sinker): Ship 8 — Yuv444p RGBA via const-ALPHA template by al8n · Pull Request #19 · Findit-AI/colconv

al8n · 2026-04-26T05:31:58Z

Tranche 4a of Ship 8 sink-side RGBA. Refactors the Yuv444p planar 4:4:4 kernel family across all 6 backends (scalar + NEON + SSE4.1 + AVX2 + AVX-512 + wasm simd128) using the const-generic-ALPHA template established by PR #16 (Yuv420p) and extended in PR #17 (NV12/NV21).

Tranche 4 was split into three sub-PRs because each format family in it has a different shape:

4a (this PR): Yuv444p (planar), own kernel family yuv_444_to_rgb_row, full refactor needed
4b (next): Nv24 / Nv42 (semi-planar), shared <SWAP_UV, ALPHA> template, mirrors PR #17 (NV12 / NV21)
4c (after): Yuv440p (planar 4:4:0), wiring-only, reuses 4a's yuv_444_to_rgba_row

Scope

Sink-side only, default opaque alpha (0xFF). Per the tracker in docs/color-conversion-functions.md § Ship 8:

| # | Tranche | Formats | Status |
|---|---------|---------|--------|
| 1 | 4:2:0 planar | `Yuv420p` | ✅ shipped (PR #16) |
| 2 | 4:2:0 semi-planar | `Nv12`, `Nv21` | ✅ shipped (PR #17) |
| 3 | 4:2:2 planar + semi-planar | `Yuv422p`, `Nv16` | ✅ shipped (PR #18) |
| 4a | 4:4:4 planar | `Yuv444p` | ⏳ this PR: kernel refactor across all 6 backends |
| 4b | 4:4:4 semi-planar | `Nv24`, `Nv42` | next: shared `<SWAP_UV, ALPHA>` template (mirrors PR #17 NV12/NV21) |
| 4c | 4:4:0 planar | `Yuv440p` | wiring-only after 4a (reuses `yuv_444_to_rgba_row`) |
| 5 | High-bit-depth 4:2:0 | `Yuv420p9/10/12/14/16`, `P010/P012/P016` | |
| 6 | High-bit-depth 4:2:2 | `Yuv422p9/10/12/14/16`, `Yuv440p10/12`, `P210/P212/P216` | |
| 7 | High-bit-depth 4:4:4 | `Yuv444p9/10/12/14/16`, `P410/P412/P416` | |

Usage:

use colconv::{
    frame::Yuv444pFrame,
    sinker::MixedSinker,
    yuv::{Yuv444p, yuv444p_to},
    ColorMatrix,
};

let frame = Yuv444pFrame::new(&y_plane, &u_plane, &v_plane, w, h, w, w, w);
let mut rgba = vec![0u8; (w * h * 4) as usize];
let mut sinker = MixedSinker::<Yuv444p>::new(w as usize, h as usize)
    .with_rgba(&mut rgba)?;

yuv444p_to(&frame, /*full_range=*/ true, ColorMatrix::Bt709, &mut sinker)?;
// rgba[4*i..4*i+3] = R,G,B; rgba[4*i+3] = 0xFF.

What's in this PR

Public API

MixedSinker<Yuv444p>::with_rgba(&mut [u8]) / set_rgba — format-specific impl block.
row::yuv_444_to_rgba_row(...) — public dispatcher paralleling the RGB variant.

Kernel work

| File | What's added |
|------|--------------|
| `row/scalar.rs` | `yuv_444_to_rgba_row` + shared `yuv_444_to_rgb_or_rgba_row<const ALPHA: bool>` template |
| `arch/neon.rs` | Same shape; uses native `vst4q_u8` when `ALPHA = true`, `vst3q_u8` otherwise |
| `arch/x86_sse41.rs` | Same shape; reuses `write_rgba_16` from PR #16 |
| `arch/x86_avx2.rs` | Same shape; reuses `write_rgba_32` from PR #16 |
| `arch/x86_avx512.rs` | Same shape; reuses `write_rgba_64` from PR #16 |
| `arch/wasm_simd128.rs` | Same shape; reuses wasm `write_rgba_16` from PR #16 |

The 4:4:4 kernel is structurally simpler than 4:2:0 (one UV pair per Y pixel, no chroma upsampling), so the const-generic-ALPHA refactor is mechanical: only the per-block store branches on ALPHA. Each backend now has three entry points: the thin wrappers yuv_444_to_rgb_row and yuv_444_to_rgba_row, which both delegate to the shared yuv_444_to_rgb_or_rgba_row template, monomorphized once per ALPHA value.
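As a rough illustration of the const-generic-ALPHA shape, here is a minimal scalar sketch. It is not the crate's kernel: the function names, the float math, and the hard-coded full-range BT.709 coefficients are all assumptions made for brevity (the real dispatcher also takes a matrix, a range flag, and a SIMD switch).

```rust
/// Shared template: `ALPHA` selects 3-byte RGB or 4-byte RGBA output.
/// Hypothetical simplification: full-range BT.709 only, float math.
pub fn yuv444_row_sketch<const ALPHA: bool>(
    y: &[u8],
    u: &[u8],
    v: &[u8],
    out: &mut [u8],
    width: usize,
) {
    let bpp: usize = if ALPHA { 4 } else { 3 };
    assert!(y.len() >= width && u.len() >= width && v.len() >= width);
    assert!(out.len() >= width * bpp);

    for i in 0..width {
        // 4:4:4: one chroma pair per luma sample, no upsampling.
        let luma = y[i] as f32;
        let cb = u[i] as f32 - 128.0;
        let cr = v[i] as f32 - 128.0;
        let r = luma + 1.5748 * cr;
        let g = luma - 0.1873 * cb - 0.4681 * cr;
        let b = luma + 1.8556 * cb;

        let px = &mut out[i * bpp..(i + 1) * bpp];
        px[0] = r.round().clamp(0.0, 255.0) as u8;
        px[1] = g.round().clamp(0.0, 255.0) as u8;
        px[2] = b.round().clamp(0.0, 255.0) as u8;
        // The store is the only place that branches on ALPHA; the branch
        // is resolved at compile time for each instantiation.
        if ALPHA {
            px[3] = 0xFF;
        }
    }
}

/// Thin RGB wrapper over the shared template.
pub fn yuv444_to_rgb_row_sketch(y: &[u8], u: &[u8], v: &[u8], out: &mut [u8], width: usize) {
    yuv444_row_sketch::<false>(y, u, v, out, width)
}

/// Thin RGBA wrapper over the shared template (opaque alpha).
pub fn yuv444_to_rgba_row_sketch(y: &[u8], u: &[u8], v: &[u8], out: &mut [u8], width: usize) {
    yuv444_row_sketch::<true>(y, u, v, out, width)
}
```

With neutral chroma (U = V = 128) the sketch maps each Y value to an equal R, G, B triple, which is the same gray-to-gray property the format-level tests check.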

MixedSinker integration

RGBA runs as an independent kernel call (not compose) — same pattern as Yuv420p (PR #16) and NV12/NV21 (PR #17). Default alpha = 0xFF since Yuv444p has no alpha plane.

Doc updates

docs/color-conversion-functions.md § Ship 8 — split tranche 4 into 4a (this PR — Yuv444p), 4b (Nv24 / Nv42), 4c (Yuv440p wiring).
The compile_fail doctest on MixedSinker::<Yuv420p>::with_rgba now uses Nv24 as its negative example instead of Yuv444p, since Nv24 is the next not-yet-wired format.

Tests

+6 lib tests, total 459 (was 453):

| Layer | Tests added |
|-------|-------------|
| Format-level Yuv444p | 4: gray-to-gray + opaque alpha, RGB-byte invariant, buffer-too-short, random-YUV SIMD parity (1922×4 frame, all 4 matrices × both ranges) |
| NEON per-backend (verified locally) | 2: 16-pixel all-matrices, varied widths (1, 3, 15, 17, 32, 33, 1920, 1921; odd widths included to validate the 4:4:4 no-parity contract) |
| SSE4.1 per-backend (CI) | 2: same shape |
| AVX2 per-backend (CI) | 2: 32-pixel main loop + tail widths |
| AVX-512 per-backend (CI) | 2: 64-pixel main loop + tail widths |
| wasm simd128 per-backend (CI) | 2: 16-pixel + tail widths |

Per-backend tests bypass the dispatcher (call each backend's unsafe yuv_444_to_rgba_row directly under runtime feature detection) so on AVX-512-capable CI runners all three x86 paths run; the existing CI matrix (avx512 SDE + AVX2-max + SSE4.1-max + scalar tarpaulin tiers) covers every backend.
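A sketch of the runtime-detection gating that lets one AVX-512-capable runner exercise every x86 path. The function name is hypothetical, and the exact feature strings the crate checks are not stated in this PR; `avx512f` and `avx512bw` below are assumptions.

```rust
/// Lists the backends a per-backend test could call directly on this machine.
/// Hypothetical helper: the crate's own gating may check different features.
pub fn detected_backends_sketch() -> Vec<&'static str> {
    #[allow(unused_mut)] // only mutated on x86 targets
    let mut backends = vec!["scalar"];

    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        // Each check is independent, so a CPU with AVX-512 typically
        // reports SSE4.1 and AVX2 as well and all three tests run.
        if std::arch::is_x86_feature_detected!("sse4.1") {
            backends.push("sse4.1");
        }
        if std::arch::is_x86_feature_detected!("avx2") {
            backends.push("avx2");
        }
        if std::arch::is_x86_feature_detected!("avx512f")
            && std::arch::is_x86_feature_detected!("avx512bw")
        {
            backends.push("avx512");
        }
    }

    backends
}
```

A per-backend test would skip (return early) when its feature is absent rather than fail, which is why the scalar tier still needs its own coverage run.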

Local results (aarch64 macOS): 459 lib tests + 1 doctest pass; wasm32 + x86_64 cross-targets compile clean.

What's deferred

Tranche 4b: Nv24 + Nv42 semi-planar 4:4:4, next PR. Same dual-const-generic shape as PR #17 (NV12/NV21).
Tranche 4c: Yuv440p, wiring-only PR after 4a; reuses this PR's yuv_444_to_rgba_row (4:4:0 = 4:4:4 with half-height chroma).
Tranches 5–7: high-bit-depth families.
with_rgba_u16 ships in tranches 5–7.
YUVA source frames (Ship 8b): independent follow-up.

Test plan

CI green on test, test-sde-avx512, cross, coverage, clippy, build, miri-* jobs.
Per-tier coverage matrix exercises SSE4.1 / AVX2 / scalar paths via existing colconv_disable_* rustflags.
Verify Yuv444p → RGBA pipeline end-to-end with a real frame (gray + non-gray patches).
cargo doc --lib --no-deps clean (no new doc warnings vs. main).

🤖 Generated with Claude Code

Copilot

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR adds native RGBA output support for YUV 4:4:4 planar (Yuv444p) across the MixedSinker path and row-kernel layer, including scalar and SIMD backends, plus tests to validate correctness and SIMD equivalence.

Changes:

Add yuv_444_to_rgba_row dispatcher and scalar/SIMD implementations (NEON, SSE4.1, AVX2, AVX-512, wasm simd128).
Extend MixedSinker<Yuv444p> with with_rgba/set_rgba and wire RGBA writing in PixelSink.
Add focused tests for Yuv444p RGBA output and SIMD-vs-scalar equivalence.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

| File | Description |
|------|-------------|
| src/sinker/mixed.rs | Adds RGBA buffer attachment for `Yuv444p` and emits RGBA during sink conversion + tests. |
| src/row/mod.rs | Exposes public `yuv_444_to_rgba_row` dispatcher with SIMD selection. |
| src/row/scalar.rs | Adds scalar RGBA kernel and refactors RGB/RGBA into a shared const-generic implementation. |
| src/row/arch/x86_sse41.rs | Adds SSE4.1 RGBA kernel and shared RGB/RGBA core + equivalence tests. |
| src/row/arch/x86_avx2.rs | Adds AVX2 RGBA kernel and shared RGB/RGBA core + equivalence tests. |
| src/row/arch/x86_avx512.rs | Adds AVX-512 RGBA kernel and shared RGB/RGBA core + equivalence tests. |
| src/row/arch/wasm_simd128.rs | Adds wasm simd128 RGBA kernel and shared RGB/RGBA core + equivalence tests. |
| src/row/arch/neon.rs | Adds NEON RGBA kernel and shared RGB/RGBA core + equivalence tests. |

Copilot · 2026-04-26T05:40:23Z

  debug_assert!(y.len() >= width, "y row too short");
  debug_assert!(u.len() >= width, "u row too short");
  debug_assert!(v.len() >= width, "v row too short");
-  debug_assert!(rgb_out.len() >= width * 3, "rgb_out row too short");
+  let bpp: usize = if ALPHA { 4 } else { 3 };
+  debug_assert!(out.len() >= width * bpp, "out row too short for {bpp}bpp");


The debug_assert! message uses {bpp} formatting but does not pass bpp as an argument, which will fail to compile due to an unmatched named format parameter. Pass bpp to the macro (e.g., via positional or named formatting), or escape the braces if you intended them to be literal.
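For context on the claim above: since Rust 1.58, format strings can capture an identifier that is in scope, and under edition 2021 this also applies to `assert!` / `debug_assert!` messages. The sketch below assumes such a toolchain and edition (the PR does not state either) and uses a hypothetical helper name; under those assumptions the pattern from the diff compiles without passing `bpp` explicitly.

```rust
/// Mirrors the length check from the diff in isolation.
/// Hypothetical helper, not part of the crate.
pub fn out_bytes_per_pixel<const ALPHA: bool>(out_len: usize, width: usize) -> usize {
    let bpp: usize = if ALPHA { 4 } else { 3 };
    // `{bpp}` is an implicitly captured identifier (Rust 1.58+, edition 2021);
    // no extra macro argument is required.
    debug_assert!(out_len >= width * bpp, "out row too short for {bpp}bpp");
    bpp
}
```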

Copilot · 2026-04-26T05:40:23Z

+    // Native RGBA: independent kernel run, separate from RGB. Default
+    // alpha = 0xFF since Yuv444p has no alpha plane.
+    if let Some(buf) = rgba.as_deref_mut() {
+      let rgba_plane_end =
+        one_plane_end
+          .checked_mul(4)
+          .ok_or(MixedSinkerError::GeometryOverflow {
+            width: w,
+            height: h,
+            channels: 4,
+          })?;
+      let rgba_plane_start = one_plane_start * 4;
+      yuv_444_to_rgba_row(
+        row.y(),
+        row.u(),
+        row.v(),
+        &mut buf[rgba_plane_start..rgba_plane_end],
+        w,
+        row.matrix(),
+        row.full_range(),
+        use_simd,
+      );
+    }


When callers request both RGB and RGBA outputs, this design runs two independent YUV→RGB conversions (one for RGBA here and another later for RGB/HSV), duplicating the core per-pixel math. Consider a combined path when both rgb and rgba are set (single conversion loop that stores into both buffers), or preferentially compute one format and derive the other with minimal overhead, to avoid doubling the hottest work.
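One possible shape for the "derive one format from the other" option is to run the YUV→RGB kernel once and then expand the RGB row into the RGBA buffer with a constant alpha. The sketch below is only an illustration of that idea; the function name is hypothetical and nothing here is taken from the crate.

```rust
/// Expands a packed RGB row into a packed RGBA row with opaque alpha.
/// Hypothetical helper illustrating an RGB -> RGBA fan-out.
pub fn expand_rgb_to_rgba_row(rgb: &[u8], rgba: &mut [u8], width: usize) {
    assert!(rgb.len() >= width * 3, "rgb row too short");
    assert!(rgba.len() >= width * 4, "rgba row too short");

    let src_px = rgb[..width * 3].chunks_exact(3);
    let dst_px = rgba[..width * 4].chunks_exact_mut(4);
    for (src, dst) in src_px.zip(dst_px) {
        dst[..3].copy_from_slice(src); // R, G, B copied unchanged
        dst[3] = 0xFF; // source format has no alpha plane
    }
}
```

The copy is a pure byte shuffle, so it costs far less per pixel than a second matrix multiply and clamp, at the price of one extra pass over the row.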

codecov · 2026-04-26T05:44:31Z

Codecov Report

❌ Patch coverage is 70.29703% with 60 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|--------------------------|---------|-------|
| src/row/arch/x86_avx512.rs | 0.00% | 34 Missing ⚠️ |
| src/row/arch/neon.rs | 80.55% | 7 Missing ⚠️ |
| src/row/mod.rs | 72.00% | 7 Missing ⚠️ |
| src/row/arch/x86_avx2.rs | 82.35% | 6 Missing ⚠️ |
| src/row/arch/x86_sse41.rs | 82.35% | 6 Missing ⚠️ |

Tranche 4b of Ship 8 sink-side RGBA. Adds `Nv24` / `Nv42` (semi-planar 4:4:4) RGBA output via the dual-const-generic `<SWAP_UV, ALPHA>` template established by PR #17 (NV12 / NV21), and **retro-applies a Strategy A combined RGB→RGBA fan-out to all 8 wired families** so callers attaching both `with_rgb` and `with_rgba` no longer pay the per-pixel YUV→RGB math twice. This addresses the Copilot review finding from PR #19 (`src/sinker/mixed.rs:1648`).

Tranche 4c of Ship 8 sink-side RGBA. Wiring-only PR: adds `Yuv440p` (4:4:0 planar, 8-bit) RGBA output by reusing the `yuv_444_to_rgba_row` dispatcher that shipped in PR #19. **No new kernel code** anywhere in the crate; per-row math is identical to 4:4:4 (full-width chroma), and only the walker differs, reading chroma row `r / 2`.
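The `r / 2` chroma indexing can be sketched as a row walker that pairs each luma row with the chroma row it shares. This is a hypothetical helper, assuming tightly packed planes (stride equal to width) and a chroma plane with `(height + 1) / 2` full-width rows; the crate's actual walker and frame types are not shown in this PR.

```rust
/// Yields (y_row, u_row, v_row) for a 4:4:0 frame: chroma is full width
/// but half height, so luma rows 2k and 2k+1 share chroma row k.
/// Hypothetical helper; assumes stride == width for every plane.
pub fn yuv440p_rows_sketch<'a>(
    y: &'a [u8],
    u: &'a [u8],
    v: &'a [u8],
    width: usize,
    height: usize,
) -> impl Iterator<Item = (&'a [u8], &'a [u8], &'a [u8])> + 'a {
    let chroma_rows = (height + 1) / 2;
    assert!(y.len() >= width * height, "y plane too short");
    assert!(u.len() >= width * chroma_rows, "u plane too short");
    assert!(v.len() >= width * chroma_rows, "v plane too short");

    (0..height).map(move |r| {
        let c = r / 2; // the only difference from a 4:4:4 walker
        (
            &y[r * width..(r + 1) * width],
            &u[c * width..(c + 1) * width],
            &v[c * width..(c + 1) * width],
        )
    })
}
```

Each yielded triple has the same shape a 4:4:4 row kernel expects, which is why no new kernel code is needed for this format.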

update

443efa4

al8n requested a review from Copilot April 26, 2026 05:32

al8n changed the title ~~update~~ feat(sinker): Ship 8 — Yuv444p RGBA via const-ALPHA template Apr 26, 2026

Copilot started reviewing on behalf of al8n April 26, 2026 05:39

Copilot AI reviewed Apr 26, 2026

update

fa48ba1

uqio merged commit 1fa7e95 into main Apr 26, 2026
57 of 58 checks passed

uqio deleted the feat/ship8-rgba-yuv444p branch April 26, 2026 05:52

al8n mentioned this pull request Apr 26, 2026

feat(sinker): Ship 8 — Nv24/Nv42 RGBA + Strategy A RGB→RGBA fan-out #20

Merged

4 tasks

al8n mentioned this pull request Apr 26, 2026

feat(sinker): Ship 8 — Yuv440p RGBA wiring (reuses Yuv444p kernels) #22

Merged

4 tasks

al8n mentioned this pull request Apr 26, 2026

feat(row): Ship 8 — high-bit 4:2:0 RGBA scalar (SIMD lands in 5a/5b) #24

Merged

10 tasks

uqio merged 2 commits into main from feat/ship8-rgba-yuv444p
